Improving Small Language Models on PubMedQA via Generative Data Augmentation
Large Language Models (LLMs) have made remarkable advancements in the field
of natural language processing. However, their increasing size poses challenges
in terms of computational cost. On the other hand, Small Language Models (SLMs)
are known for their efficiency, but they often struggle with limited capacity
and training data, especially in specific domains. In this paper, we introduce
a novel method aimed at improving SLMs in the medical domain using LLM-based
generative data augmentation. The objective of our approach is to develop more
efficient and capable models that are specifically tailored for specialized
applications. Through experiments conducted on the PubMedQA dataset, we
demonstrate the effectiveness of LLMs in refining and diversifying existing
question-answer pairs. This refinement process leads to improved performance in
a significantly smaller model after fine-tuning. Notably, our best SLM, with
under 1.6 billion parameters, outperforms the few-shot GPT-4 on the PubMedQA
dataset. Our code and generated data are publicly available to facilitate
further exploration.
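As a rough illustration of the augmentation idea, the loop below expands a QA dataset with LLM-generated variants of each pair. Here `rewrite_with_llm` is a hypothetical placeholder for the actual LLM call (the paper's prompts and models are not reproduced here):

```python
import random

# Hypothetical stand-in for an LLM call; a real system would prompt a
# large model to paraphrase the question or expand the answer.
def rewrite_with_llm(question, answer, style):
    if style == "paraphrase":
        return f"[paraphrased] {question}", answer
    return question, f"[expanded] {answer}"

def augment_qa_pairs(qa_pairs, per_example=2, seed=0):
    """Expand a QA dataset with LLM-generated variants of each pair."""
    rng = random.Random(seed)
    augmented = list(qa_pairs)  # keep the original pairs
    for q, a in qa_pairs:
        for _ in range(per_example):
            style = rng.choice(["paraphrase", "expand"])
            augmented.append(rewrite_with_llm(q, a, style))
    return augmented

data = [("Does drug X reduce mortality?", "yes")]
print(len(augment_qa_pairs(data)))  # 1 original + 2 variants = 3
```

The augmented set is then used to fine-tune the small model, with the originals retained so the LLM variants only add diversity.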
FPSA: A Full System Stack Solution for Reconfigurable ReRAM-based NN Accelerator Architecture
Neural Network (NN) accelerators with emerging ReRAM (resistive random access
memory) technologies have been investigated as one of the promising solutions
to address the \textit{memory wall} challenge, due to the unique capability of
\textit{processing-in-memory} within ReRAM-crossbar-based processing elements
(PEs). However, the high efficiency and high density advantages of ReRAM have
not been fully utilized due to the huge communication demands among PEs and the
overhead of peripheral circuits.
In this paper, we propose a full system stack solution, composed of a
reconfigurable architecture design, Field Programmable Synapse Array (FPSA) and
its software system including neural synthesizer, temporal-to-spatial mapper,
and placement & routing. We highly leverage the software system to make the
hardware design compact and efficient. To satisfy the high-performance
communication demand, we optimize inter-PE communication with a reconfigurable
routing architecture and the placement & routing tool. To improve computational
density, we greatly simplify the PE circuit with a spiking scheme and then
adopt the neural synthesizer so that the high-density computation resources can
support different kinds of NN operations. In addition, we provide spiking
memory blocks (SMBs) and configurable logic blocks (CLBs) in hardware and
leverage the temporal-to-spatial mapper to balance the storage and computation
requirements of NNs. Owing to the end-to-end software system, we can
efficiently deploy existing deep neural networks to FPSA. Evaluations show
that, compared to PRIME, one of the state-of-the-art ReRAM-based NN
accelerators, the computational density of FPSA improves by 31x; for
representative NNs, its inference performance achieves up to a 1000x speedup.
Comment: Accepted by ASPLOS 201
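The processing-in-memory capability mentioned above comes from the crossbar physics: weights stored as cell conductances multiply applied input voltages in a single analog step (Ohm's law per cell, Kirchhoff's current law per column). A minimal numerical sketch of that operation, not FPSA's actual circuit:

```python
import numpy as np

def crossbar_mvm(conductances, voltages):
    """Model an ideal ReRAM crossbar: output currents i = G @ v."""
    return conductances @ voltages

G = np.array([[1.0, 0.5],   # conductance matrix = stored NN weights
              [0.2, 0.8]])
v = np.array([1.0, 2.0])    # input voltages = activations
print(crossbar_mvm(G, v))   # [2.0, 1.8]
```

A real crossbar adds quantization, nonidealities, and peripheral ADC/DAC overhead, which is precisely the circuitry FPSA's spiking scheme simplifies.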
Using Multiple Instance Learning to Build Multimodal Representations
Image-text multimodal representation learning aligns data across modalities
and enables important medical applications, e.g., image classification, visual
grounding, and cross-modal retrieval. In this work, we establish a connection
between multimodal representation learning and multiple instance learning.
Based on this connection, we propose a generic framework for constructing
permutation-invariant score functions with many existing multimodal
representation learning approaches as special cases. Furthermore, we use the
framework to derive a novel contrastive learning approach and demonstrate that
our method achieves state-of-the-art results on a number of downstream tasks.
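An illustrative sketch of a permutation-invariant score function in the MIL view: one modality is a bag of instances (e.g., image patches), and a symmetric pooling over instance-query similarities makes the score independent of instance order. The log-sum-exp pooling here is an assumed choice for illustration, not necessarily the paper's:

```python
import numpy as np

def bag_score(instances, query, tau=1.0):
    """Score a bag of instance embeddings against a query embedding."""
    sims = instances @ query  # similarity of each instance to the query
    # smooth-max pooling: symmetric in the instances, hence
    # invariant to any permutation of the bag
    return tau * np.log(np.mean(np.exp(sims / tau)))

patches = np.array([[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]])
text = np.array([1.0, 0.0])
s1 = bag_score(patches, text)
s2 = bag_score(patches[::-1], text)  # permute the bag
print(np.isclose(s1, s2))  # True: score is permutation-invariant
```

Many existing multimodal objectives correspond to different choices of the pooling function in this template.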
DGI: Easy and Efficient Inference for GNNs
While many systems have been developed to train Graph Neural Networks (GNNs),
efficient model inference and evaluation remain to be addressed. For instance,
using the widely adopted node-wise approach, model evaluation can account for
up to 94% of the time in the end-to-end training process due to neighbor
explosion, in which each node must access its multi-hop neighbors. On the
other hand, layer-wise inference avoids the neighbor explosion problem by
conducting inference layer by layer such that the nodes only need their one-hop
neighbors in each layer. However, implementing layer-wise inference requires
substantial engineering efforts because users need to manually decompose a GNN
model into layers for computation and split workload into batches to fit into
device memory. In this paper, we develop Deep Graph Inference (DGI) -- a system
for easy and efficient GNN model inference, which automatically translates the
training code of a GNN model for layer-wise execution. DGI is general for
various GNN models and different kinds of inference requests, and supports
out-of-core execution on large graphs that cannot fit in CPU memory.
Experimental results show that DGI consistently outperforms layer-wise
inference across different datasets and hardware settings, and the speedup can
be over 1,000x.
Comment: 10 pages, 10 figures
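A minimal sketch of the layer-wise pattern that DGI automates, using a toy GCN-style model: all node embeddings are computed one layer at a time, so each layer touches only 1-hop neighbors, and nodes are processed in batches to fit device memory. The model and batching below are illustrative, not DGI's API:

```python
import numpy as np

def layerwise_inference(A, X, Ws, batch_size=2):
    """Full-graph inference, layer by layer, in node batches.

    A:  (n, n) normalized adjacency matrix
    X:  (n, d) input node features
    Ws: list of per-layer weight matrices
    """
    H = X
    for W in Ws:
        msgs = A @ H  # 1-hop aggregation for the whole layer at once
        out = np.empty((H.shape[0], W.shape[1]))
        for start in range(0, H.shape[0], batch_size):
            end = start + batch_size  # batch nodes to bound memory use
            out[start:end] = np.maximum(msgs[start:end] @ W, 0.0)  # ReLU
        H = out  # this layer's output feeds the next layer
    return H

A = np.eye(3)  # toy graph: three self-loop nodes
X = np.ones((3, 2))
Ws = [np.ones((2, 4)), np.ones((4, 1))]
print(layerwise_inference(A, X, Ws).shape)  # (3, 1)
```

The engineering burden DGI removes is exactly this manual decomposition: splitting the model into layers and the nodes into memory-sized batches.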
Sample-Specific Debiasing for Better Image-Text Models
Self-supervised representation learning on image-text data facilitates
crucial medical applications, such as image classification, visual grounding,
and cross-modal retrieval. One common approach involves contrasting
semantically similar (positive) and dissimilar (negative) pairs of data points.
Drawing negative samples uniformly from the training data set introduces false
negatives, i.e., samples that are treated as dissimilar but belong to the same
class. In healthcare data, the underlying class distribution is nonuniform,
implying that false negatives occur at a highly variable rate. To improve the
quality of learned representations, we develop a novel approach that corrects
for false negatives. Our method can be viewed as a variant of debiased
contrastive learning that uses estimated sample-specific class probabilities.
We provide theoretical analysis of the objective function and demonstrate the
proposed approach on both image and paired image-text data sets. Our
experiments demonstrate the empirical advantages of sample-specific debiasing.
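One way such a correction can look, sketched under the assumption that the objective resembles debiased contrastive learning with a per-sample prior `tau_i` (the probability that a uniformly drawn "negative" actually shares sample i's class); the exact loss and floor constant are illustrative, not the paper's:

```python
import numpy as np

def debiased_negative_term(pos_exp, neg_exps, tau_i):
    """Correct the uniform-negative estimate for false negatives."""
    raw = np.mean(neg_exps)
    corrected = (raw - tau_i * pos_exp) / (1.0 - tau_i)
    # clip to a small positive floor as a numerical safeguard
    return max(corrected, np.exp(-1.0))

def debiased_loss(pos_sim, neg_sims, tau_i, t=0.5):
    """InfoNCE-style loss with a sample-specific debiased negative term."""
    pos_exp = np.exp(pos_sim / t)
    neg_exps = np.exp(np.asarray(neg_sims) / t)
    neg_term = len(neg_sims) * debiased_negative_term(pos_exp, neg_exps, tau_i)
    return -np.log(pos_exp / (pos_exp + neg_term))

# A larger estimated tau_i removes more false-negative mass from the
# denominator, so the debiased loss is smaller than the uniform one.
print(debiased_loss(0.8, [0.1, 0.2, -0.3], 0.2) <
      debiased_loss(0.8, [0.1, 0.2, -0.3], 0.0))  # True
```

With a globally fixed tau this reduces to standard debiased contrastive learning; the sample-specific estimate is what adapts the correction to the nonuniform class distribution of healthcare data.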